The Empirical Bayes Envelope and Regret Minimization in Competitive Markov Decision Processes
Authors
Abstract
Similar resources
Empirical Bayes Estimation in Nonstationary Markov Chains
Estimation procedures for nonstationary Markov chains appear to be relatively sparse. This work introduces empirical Bayes estimators for the transition probability matrix of a finite nonstationary Markov chain. The data are assumed to be of a panel study type in which each data set consists of a sequence of observations on N>=2 independent and identically dis...
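To make the panel setup concrete, here is a minimal Python sketch of per-step transition-matrix estimation from N independent chains, shrinking each per-step estimate toward a pooled matrix as a simple stand-in for empirical-Bayes smoothing. The function and parameter names (transition_estimates, panel, shrink) are illustrative assumptions, not the estimator introduced in the paper.

```python
import numpy as np

def transition_estimates(panel, n_states, shrink=0.5):
    """Per-step transition estimates for a nonstationary Markov chain.

    panel: array of shape (N, T+1) with the state index of each of the
    N independent chains at times 0..T. Returns an array of shape
    (T, n_states, n_states); est[t, i, j] estimates P(X_{t+1}=j | X_t=i).
    Illustrative only -- the shrinkage toward the pooled matrix is a crude
    stand-in for the empirical Bayes machinery described above.
    """
    panel = np.asarray(panel)
    N, T_plus_1 = panel.shape
    T = T_plus_1 - 1

    # counts[t, i, j] = number of chains that moved i -> j at step t.
    counts = np.zeros((T, n_states, n_states))
    for t in range(T):
        np.add.at(counts[t], (panel[:, t], panel[:, t + 1]), 1)

    # Pooled counts across all steps play the role of a data-driven prior mean.
    pooled = counts.sum(axis=0)
    pooled_probs = pooled / np.maximum(pooled.sum(axis=1, keepdims=True), 1)

    # Shrink each per-step row estimate toward the pooled rows.
    est = np.zeros_like(counts)
    for t in range(T):
        row_totals = np.maximum(counts[t].sum(axis=1, keepdims=True), 1)
        est[t] = (1 - shrink) * (counts[t] / row_totals) + shrink * pooled_probs
    return est

# Example: 3 chains observed at 4 time points, 2 states.
panel = [[0, 1, 1, 0],
         [0, 0, 1, 1],
         [1, 1, 0, 0]]
print(transition_estimates(panel, n_states=2).shape)  # (3, 2, 2): one matrix per step
```

Borrowing strength from the pooled matrix matters precisely because each time step contributes only N transitions per row, which is the situation the panel-study design addresses.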
Regret Minimization in Signal Space for Repeated Matrix Games with Partial Observations
The Bayes Envelope of a repeated matrix game traces the maximal payoff rate that a player (say P1) could secure for herself had she known in advance the empirical frequency of P2's actions. It was shown by J. Hannan (1957) that Regret Minimizing policies for P1 exist which asymptotically attain the Bayes Envelope even without such prior knowledge, but assuming complete observation of P2's ...
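As a concrete reading of that definition, the sketch below computes the Bayes envelope payoff, i.e. P1's best fixed response to the empirical frequency of P2's actions, and the resulting regret of a realized play sequence in a repeated matrix game. The helper names and the matching-pennies example are illustrative assumptions, not a construction from Hannan's paper or the work cited here.

```python
import numpy as np

def bayes_envelope_payoff(payoff, p2_actions):
    """Best average payoff P1 could have secured against P2's empirical mix.

    payoff[i, j] is P1's payoff when P1 plays i and P2 plays j.
    """
    payoff = np.asarray(payoff, dtype=float)
    freq = np.bincount(p2_actions, minlength=payoff.shape[1]) / len(p2_actions)
    return float(np.max(payoff @ freq))  # value of the best fixed response

def regret(payoff, p1_actions, p2_actions):
    """Gap between the Bayes envelope and P1's realized average payoff."""
    payoff = np.asarray(payoff, dtype=float)
    realized = np.mean([payoff[i, j] for i, j in zip(p1_actions, p2_actions)])
    return bayes_envelope_payoff(payoff, p2_actions) - realized

# Matching pennies: P1 always plays 0 against a P2 biased toward action 1.
A = [[1.0, -1.0], [-1.0, 1.0]]
print(regret(A, p1_actions=[0, 0, 0, 0], p2_actions=[1, 1, 0, 1]))  # prints 1.0
```

A Regret Minimizing (Hannan-consistent) policy drives this gap to zero asymptotically; the partial-observation setting in the title asks how to do so when P2's actions are seen only through a signal.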
A Regret Minimization Approach in Product Portfolio Management with respect to Customers’ Price-sensitivity
In an uncertain and competitive environment, product portfolio management (PPM) becomes more challenging for manufacturers, who must decide what to make and establish the most beneficial product portfolio. In this paper, a novel approach to PPM is proposed in which environmental uncertainty, competitors’ behavior and customers’ satisfaction are simultaneously considered as the most important criteri...
Better Rates for Any Adversarial Deterministic MDP
We consider regret minimization in adversarial deterministic Markov Decision Processes (ADMDPs) with bandit feedback. We devise a new algorithm that pushes the state of the art forward in two ways: First, it attains a regret of O(T^{2/3}) with respect to the best fixed policy in hindsight, whereas the previous best regret bound was O(T^{3/4}). Second, the algorithm and its analysis are compatible with any...
Sample Complexity Bounds of Exploration
Efficient exploration is widely recognized as a fundamental challenge inherent in reinforcement learning. Algorithms that explore efficiently converge faster to near-optimal policies. While heuristic techniques are popular in practice, they lack formal guarantees and may not work well in general. This chapter studies algorithms with polynomial sample complexity of exploration, both model-based...
Journal: Math. Oper. Res.
Volume: 28, Issue: -
Pages: -
Publication year: 2003